logo

Table of Contents

What is version control

Here’s a super simple video that illustrates why we need version control. Git is used for version control.

It’s also important to use version control on non-collaborative projects. It allows for you to save code in order to compile and run, but also to be able to look back into the past for historic version, and essentially have an “undo” button in case you mess something up along the way.

So let’s say you’re working on a project. When it’s what you are working on, it is called (in version control language) a branch. You make changes to that branch, and you save these changes. When you have made enogh changes for that to be substantial, you commit these changes to your branch. Essentially this means that you’re creating an image of your project, as it is in this time. You can revert back to this at any time.

Your main branch that you’re working on is called the “master” branch. You can keep creating save points to the master branch, or if you get to a point where you like how it is, but then you want to create some feature for it, you can create a feature branch. Then you can work on the feature branch, or you can hire someone to work on the feature branch while you can continue to develop the master branch, and then when ready, you can merge that back in, creating a new save point.

GitHub is a social coding platform, it implements git

Terms

You can find a good beginner overview of references for the commands on how to do these here

There are a lot more (eg fast forward merge, rebase, 3-way merge, etc) but I’ll leave that for future explorations.

Looking at an example

So for example let’s look at a huge project, such as Hadley Wickam’s ggplot2 package for R

You can see the GitHub repository for it here

Here’s GitHub in all it’s glory.

So it opens on the code page, and you can see all the code that makes up this package here. You can see that this project is made up of a lot of elements, lots of code, but also you can see there is so much activity that goes into this. This is collaborative coding at its max potential really.

You can choose to watch it, and if you do you get notifications every time that something happens.

You can also ‘star’ it, which is like ‘heart’-ing something on twitter in that it serves two purposes:

  • Creating a bookmark for easier access
  • Showing appreciation to the repository maintainer for their work

You can also fork the repository.

This action takes the current version of all the code, and creates a copy as a repository in your own GitHub profile. This is how you take someone else’s code and build on it.

As mentioned, we are currently on the code tab. You can see that one of the tabs is labelled ‘Pull requests’. Basically, if you fork a repository, and make some changes, you can merge those changes back into the original source, through this request. This is often reviewed by the person who maintains the code.

For example here you can see someone who wrote a bit of code to give outliers to violin plots, and their code is currently being checked:

Another tab that is interesting is the ‘issues’ tab. This is where you can raise issues with the code if it’s something you use. If you have an unresolved issue with ggplot2, that you believe would require fixing, you could raise that here.

For example, this person has raised an issue about a false warning message using geom_bar():

And also there is the Wiki page for the project, where lots of detail can be found about it:

Overall you can see that there is a lot going on here. This is a very busy project with many (almost 100!) collaborators, and so it’s good to have a browse around it just to explore what sorts of contributions people are making, what communication is going on, how are the versions merged etc.

Using GitHub

OK let’s get into it. So to create a github account, you’ll need to sign up here.

Once you have your account, you are ready to start some version control! There are a few things that you can do directly on the github site. For example, you can set up an organisation. Since you guys want to be sharing your R code, you can set up an MMU R organisation.

To do so someone should create it

Thank you brave person.

OK so click on your name and in the dropdown menu select ‘create organisation’

Follow through the steps.

Add some members.

Create a repository.

Yay you now have a place to share all your code. This is one of the things you can do on the site.

However you want to be able to work on whatever you’re working on on your own pc. To do this, there are many approaches. You can directly connect R Studio to GitHub, which we’ll be doing in the next section.

The first step however is to set up Git. You can do that by following the steps here

It is quite common to use git from the command line. In a moment we’re going to set up the GitHub desktop GUI, but in general you should learn some basic commands. To do that go through the try github exercise.

But I am lazy, and also I don’t use GitHub as intricately as some do, so most of the time I use the GitHub desktop. You can download and set it up here. Some people really don’t like it. The main issues with it (I think) are around the limitations. Within the GitHub Desktop interface, there’s no way to browse existing issues in a repository, create new ones, or reply to existing items. Instead, you have to follow a link from within the program to the repository on the GitHub site. The same goes for repository wikis or graphs; you must go through the GitHub site to deal with them. However for my basic needs of essentially just archiving my work, it works amazingly.

Once you have set that up, you can either create a new repository, or you can fork something that’s already out there.

R Studio integration

With R Studio there is handy integration with GitHub, so you can pull commit and push all from R studio.

First, you will need to make sure that you have git installed on your computer.

Install Git

You need Git, so you can use it at the command line and so RStudio can call it. We’ve covered this in the previous section but in case you are joining in late, you can follow the detailed instructions here for installing git

Create new project (and git)

In RStudio, click New project as normal.

Click New Directory.

Name the project and check Create a git repository.

Now in RStudio, create a new script which you will add to your repository. After saving your new script (test.R), it should appear in the Git tab on the Environment / history panel.

You can commit the changes by ticking the boxes next to them under staged, entering some comments about the commit, and then clicking on the ‘Commit’ button.

Push

To sync your commits back to github, go to Tools> Shell…

This will open a comms shell, on Mac this will be your terminal.

Type the following

git remote add origin https://github.com/maczokni/rStudioTest.git
git config remote.origin.url git@github.com:maczokni/rStudioTest.git
git push -u origin master

If you’re on a mac, you also need:

git config --global credential.helper osxkeychain

Ta-daa! You have now pushed your commit to GitHub, and should be able to see your files in your GitHub account. The Pull Push buttons in RStudio will now also work. Remember, after each Commit, you have to Push to GitHub, this doesn’t happen automatically.

You can see more eg how to clone a repository into your R session here and here

Running a shiny app

If a shiny application is on github, you should be able to run it from there, without having to download the code.

You will need to load up shiny library

library(shiny)

And then all you need is runGitHub( " name of app ", " name of github account ") to run the app

For example, I have an app called “crimeMapTest” in my github, which you can run with the code below:

runGitHub( "crimeMapTest", "maczokni") 

Github pages

You can also use GitHub to host a page for your project, or to host a webpage for you.

For example, look at this one. You can give it a go with a very easy automatic page generator.

More details on what is github pages and how you can use it can be found here.

Super sneaky shortcut

It’s not actually that sneaky, but if you just want a HTML page to be displayed with github, there is a shortcut to doing this.

So for example, this markdown document I compile into a html page. I then upload it to my github. I can see it in my repository here:

Then click on the file. It should open it up and look something like this. Click on where it says “View raw”:

Now you should see something like this:

Take the URL and change https://raw.githubusercontent.com/... part to https://rawgit.com/... and put that in a web browser.

Ta-daa!